 |
"Training for Optical Character Recognition and Khmer Language Processing" |
|
|
 |
Overview |
 |
 |
|
The Cambodia component of the PAN Localization project successfully built preliminary
language processing applications in Phase I of the project. The next step was to develop
more intricate tools for Khmer NLP, e.g. a text-to-speech system, optical character recognition
and mobile SMS. However, the team required training before it could design such applications.
|
|
|
Unfortunately, the senior members of the PAN team left for higher education at the same time and were
replaced by a relatively inexperienced new team.
Although all of the new team members of PAN Cambodia were qualified computer scientists,
they did not have much experience of NLP application development.
It took a while for them to understand the working environment,
especially for more complex computing. |
|
 |
|
|
 |
Objectives |
 |
 |
|
The prime objective of the training was to equip
the Cambodia team with basic local language computing technologies. It was also necessary to
guide the team in fulfilling its commitments to the project and finishing the work on time.
The team received training that started from basic programming skills and led towards
advanced application development. Broadly, the training comprised the following objectives:
- Khmer OCR
- OpenOffice plug-ins for Khmer applications
- Khmer collation support for MySQL
- Automatic POS tagger for Khmer
- Khmer lexicon development
- Khmer tagged corpus
- Khmer SMS software for Java-based mobile phones
- Workshops on Khmer application training for students and university staff
The following tasks were
planned to help the team achieve the objectives of the PAN
Localization project:
Technical Tasks:
- Programming skills development in C++
- Fundamentals of digital image processing
- Modular understanding of an OCR system
- Introduction to artificial intelligence
- OpenOffice embedding
- Part-of-speech tagging techniques
- Collation embedding in MySQL
- Development on mobile platforms
Non Technical Tasks:
|
|
 |
|
|
 |
Challenges |
 |
 |
|
The prime challenges faced during the training
were the sustainability of the work and capacity building of human resources. The Phase I
PAN team was well versed in language processing applications, but there was little
overlap time for the new team to absorb its skills. In addition, image
processing was a new field for the new developers, and they also needed some warm-up
exercises in robust, error-free programming. It took a while (the first one and a half
months of training) for them to become hands-on with image handling and artificial
intelligence processing. |
|
 |
|
|
 |
Trainer Profile |
 |
 |
|
Trainer Name: Ahmed Muaz
Ahmed Muaz is a computer science graduate (2007)
of FAST-NU, Lahore, and is currently enrolled in a Master's program. His area of specialization is
Natural Language Processing. He has been part of the PAN Localization regional secretariat team
since 2005 and has worked on different sub-projects. He currently serves as Associate
Development Engineer at CRULP. |
|
 |
|
|
 |
Training Participants |
 |
 |
|
Ms. Khem Sochenda
Ms. Sophea Vann
Mr. Ing LengIeng
Mr. Tith Sakal
Mr. Oudom Keo
Mr. Sovathena Neth
Mr. Visal
|
|
 |
|
|
 |
Discussion and Conclusion |
 |
 |
|
The training was very successful, as it achieved
all planned goals. The Cambodia Country Leader (Mr. Chea Sok Huor) played a significant
role in this success by providing the team and trainer with all required
resources. The newly hired team lead and manager
were guided and trained to accomplish tasks on time. During the latter half of the training,
one seminar and two workshops were also conducted.
|
|
 |
|